29 research outputs found

    Synapse: Trajectory-as-Exemplar Prompting with Memory for Computer Control

    Building agents using large language models (LLMs) to control computers is an emerging research field, where the agent perceives computer states and performs actions to accomplish complex tasks. Previous computer agents have demonstrated the benefits of in-context learning (ICL); however, their performance is hindered by several issues. First, the limited context length of LLMs and complex computer states restrict the number of exemplars, as a single webpage can consume the entire context. Second, the exemplars in current methods, such as high-level plans and multi-choice questions, cannot represent complete trajectories, leading to suboptimal performance in tasks that require many steps or repeated actions. Third, existing computer agents rely on task-specific exemplars and overlook the similarity among tasks, resulting in poor generalization to novel tasks. To address these challenges, we introduce Synapse, featuring three key components: i) state abstraction, which filters out task-irrelevant information from raw states, allowing more exemplars within the limited context, ii) trajectory-as-exemplar prompting, which prompts the LLM with complete trajectories of the abstracted states and actions for improved multi-step decision-making, and iii) exemplar memory, which stores the embeddings of exemplars and retrieves them via similarity search for generalization to novel tasks. We evaluate Synapse on MiniWoB++, a standard task suite, and Mind2Web, a real-world website benchmark. In MiniWoB++, Synapse achieves a 99.2% average success rate (a 10% relative improvement) across 64 tasks using demonstrations from only 48 tasks. Notably, Synapse is the first ICL method to solve the book-flight task in MiniWoB++. Synapse also exhibits a 53% relative improvement in average step success rate over the previous state-of-the-art prompting scheme in Mind2Web. Comment: 22 pages, 7 figures
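
    The exemplar-memory idea in this abstract (store exemplar embeddings, retrieve by similarity search) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `embed` function is a toy bag-of-words stand-in for a learned embedding model, and `ExemplarMemory`, the task strings, and the action strings are hypothetical, not taken from Synapse.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding (hypothetical stand-in for a learned model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ExemplarMemory:
    """Stores (embedding, task, trajectory) entries and retrieves the closest ones."""

    def __init__(self):
        self._entries = []

    def add(self, task, trajectory):
        self._entries.append((embed(task), task, trajectory))

    def retrieve(self, query, k=1):
        """Return the k stored (task, trajectory) pairs most similar to the query."""
        q = embed(query)
        ranked = sorted(self._entries, key=lambda e: cosine(q, e[0]), reverse=True)
        return [(task, trajectory) for _, task, trajectory in ranked[:k]]


# Hypothetical exemplars: a task description plus a full action trajectory.
memory = ExemplarMemory()
memory.add("book a flight from A to B",
           ["type origin", "type destination", "click search"])
memory.add("send an email to a contact",
           ["click compose", "type recipient", "click send"])

# A task not stored in memory is matched to the most similar stored exemplar.
best_task, best_trajectory = memory.retrieve("book a cheap flight to B")[0]
```

    In this sketch the retrieved trajectory would then be placed in the prompt as the in-context exemplar; a practical system would replace the toy embedding with a real embedding model and a vector index.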

    Market-GAN: Adding Control to Financial Market Data Generation with Semantic Context

    Financial simulators play an important role in enhancing forecasting accuracy, managing risks, and fostering strategic financial decision-making. Despite the development of financial market simulation methodologies, existing frameworks often struggle to adapt to specialized simulation contexts. We pinpoint the challenges as i) current financial datasets do not contain context labels; ii) current techniques are not designed to generate financial data with context as control, which demands greater precision compared to other modalities; iii) the inherent difficulties in generating context-aligned, high-fidelity data given the non-stationary, noisy nature of financial data. To address these challenges, our contributions are: i) we propose the Contextual Market Dataset with market dynamics, stock ticker, and history state as context, leveraging a market dynamics modeling method that combines linear regression and Dynamic Time Warping clustering to extract market dynamics; ii) we present Market-GAN, a novel architecture incorporating a Generative Adversarial Network (GAN) for controllable generation with context, an autoencoder for learning low-dimensional features, and supervisors for knowledge transfer; iii) we introduce a two-stage training scheme to ensure that Market-GAN captures the intrinsic market distribution with multiple objectives. In the pretraining stage, with the use of the autoencoder and supervisors, we prepare the generator with a better initialization for the adversarial training stage. We propose a set of holistic evaluation metrics that consider alignment, fidelity, data usability on downstream tasks, and market facts. We evaluate Market-GAN with the Dow Jones Industrial Average data from 2000 to 2023 and showcase superior performance in comparison to 4 state-of-the-art time-series generative models.
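
    The abstract mentions Dynamic Time Warping (DTW) clustering without detail. As a rough illustration of the distance such clustering typically relies on, here is a minimal sketch of the classic DTW dynamic program. It is the textbook algorithm only, not the authors' pipeline; the function name `dtw_distance` and the sample series are assumptions.

```python
def dtw_distance(x, y):
    """Classic DTW distance between two numeric sequences (absolute-difference cost)."""
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning x[:i] with y[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of: advance x only, advance y only, advance both.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical price segments: the second is a time-stretched copy of the first.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([1, 2, 3], [2, 3, 4])
```

    Because DTW allows one point to align with several, segments with the same shape but different speeds end up close together, which is what makes it a common choice for clustering time-series segments.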

    Offline Equilibrium Finding

    Offline reinforcement learning (Offline RL) is an emerging field that has recently begun gaining attention across various application domains due to its ability to learn behavior from earlier collected datasets. Using logged data is imperative when further interaction with the environment is expensive (computationally or otherwise), unsafe, or entirely unfeasible. Offline RL proved very successful, paving a path to solving previously intractable real-world problems, and we aim to generalize this paradigm to a multi-agent or multiplayer-game setting. Very little research has been done in this area, as the progress is hindered by the lack of standardized datasets and meaningful benchmarks. In this work, we coin the term offline equilibrium finding (OEF) to describe this area and construct multiple datasets consisting of strategies collected across a wide range of games using several established methods. We also propose a benchmark method -- an amalgamation of a behavior-cloning and a model-based algorithm. Our two model-based algorithms -- OEF-PSRO and OEF-CFR -- are adaptations of the widely-used equilibrium finding algorithms Deep CFR and PSRO in the context of offline learning. In the empirical part, we evaluate the performance of the benchmark algorithms on the constructed datasets. We hope that our efforts may help to accelerate research in large-scale equilibrium finding. Datasets and code are available at https://github.com/SecurityGames/oef

    EarnHFT: Efficient Hierarchical Reinforcement Learning for High Frequency Trading

    High-frequency trading (HFT) uses computer algorithms to make trading decisions in short time scales (e.g., second-level) and is widely used in the Cryptocurrency (Crypto) market (e.g., Bitcoin). Reinforcement learning (RL) in financial research has shown stellar performance on many quantitative trading tasks. However, most methods focus on low-frequency trading, e.g., day-level, and cannot be directly applied to HFT because of two challenges. First, RL for HFT involves dealing with extremely long trajectories (e.g., 2.4 million steps per month), which are hard to optimize and evaluate. Second, the dramatic price fluctuations and market trend changes of Crypto make existing algorithms fail to maintain satisfactory performance. To tackle these challenges, we propose an Efficient hieArchical Reinforcement learNing method for High Frequency Trading (EarnHFT), a novel three-stage hierarchical RL framework for HFT. In stage I, we compute a Q-teacher, i.e., the optimal action value based on dynamic programming, for enhancing the performance and training efficiency of second-level RL agents. In stage II, we construct a pool of diverse RL agents for different market trends, distinguished by return rates, where hundreds of RL agents are trained with different preferences of return rates and only a tiny fraction of them will be selected into the pool based on their profitability. In stage III, we train a minute-level router which dynamically picks a second-level agent from the pool to achieve stable performance across different markets. Through extensive experiments in various market trends on Crypto markets in a high-fidelity simulation trading environment, we demonstrate that EarnHFT significantly outperforms 6 state-of-the-art baselines in 6 popular financial criteria, exceeding the runner-up by 30% in profitability.
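
    The stage I idea of computing optimal action values by dynamic programming can be sketched under strong simplifying assumptions that are not stated in the abstract: a single asset, only two positions (flat = 0, long = 1), a proportional fee charged on position changes, and a fully known future price path. The function name `optimal_action_values` and the price series are hypothetical.

```python
def optimal_action_values(prices, fee_rate=0.0):
    """Backward-induction Q-values for a toy two-position trading problem.

    q[t][pos][action] is the best total profit obtainable from step t onward
    when currently holding `pos` and choosing to hold `action` over [t, t+1].
    """
    horizon = len(prices) - 1
    positions = (0, 1)  # 0 = flat, 1 = long one unit (assumed action space)
    value_next = [0.0, 0.0]  # nothing left to earn after the final price
    q = [None] * horizon
    for t in range(horizon - 1, -1, -1):
        q_t = [[0.0, 0.0], [0.0, 0.0]]
        for pos in positions:
            for action in positions:
                pnl = action * (prices[t + 1] - prices[t])
                cost = fee_rate * prices[t] * abs(action - pos)
                q_t[pos][action] = pnl - cost + value_next[action]
        q[t] = q_t
        # Optimal value of arriving at step t in each position.
        value_next = [max(q_t[0]), max(q_t[1])]
    return q


# Hypothetical price path; with hindsight the best plan is long, flat, long.
q_values = optimal_action_values([10, 12, 11, 15])
best_first_action = max((0, 1), key=lambda a: q_values[0][0][a])
```

    Values computed this way use hindsight and are only available on historical data, which is why they can serve as a training signal for an agent rather than as a tradable policy.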

    Annealing tunable charge density wave order in a magnetic kagome material FeGe

    In the magnetic kagome metal FeGe, a charge density wave (CDW) order emerges inside the antiferromagnetic phase, providing a fertile playground to investigate the interplay between charge and magnetic orders. Here, we demonstrate that the CDW order, as well as magnetic properties, can be reversibly tuned on a large scale through post-growth annealing treatments. The antiferromagnetic and CDW transitions vary systematically as functions of both the temperature and the time period of annealing. Long-range CDW order with a maximum $T_{\mathrm{CDW}}$ and a minimum $T_{\mathrm{N}}$ can be realized in crystals annealed at 320 °C for over 48 h. Using magnetization and magnetostrictive coefficient measurements, it is found that the CDW transition is rather stable against an external magnetic field and spin-flop transition. On the other hand, the critical field for spin-flop transition is significantly reduced in the long-range ordered CDW phase. Our results indicate that the CDW in FeGe is immune to variations in magnetic orders, while the magnetocrystalline anisotropy energy and the corresponding magnetic ground state can be altered significantly by the charge order. These findings provide crucial clues for further investigation and a better understanding of the nature of the CDW order in FeGe. Comment: 8 pages, 4 figures

    Angle dependent field-driven reorientation transitions in uniaxial antiferromagnet MnBi$_2$Te$_4$ single crystal

    MnBi$_2$Te$_4$, a two-dimensional magnetic topological insulator with a uniaxial antiferromagnetic structure, is an ideal platform to realize the quantum anomalous Hall effect. However, the strength of its magnetic interactions is not yet clear. We performed systematic studies on the magnetization and angle-dependent magnetotransport of MnBi$_2$Te$_4$ single crystals. The results show that the direction of the magnetic field has significant effects on the critical field values and magnetic structure of this compound, which leads to different magnetotransport behaviors. The field-driven reorientation transitions can be utilized to estimate the AFM interlayer exchange coupling and the uniaxial magnetic anisotropy D. Monte Carlo simulations based on the obtained Hamiltonian explain the experimental data well. Our comprehensive studies on the field-driven magnetic transitions in MnBi$_2$Te$_4$ provide a general approach for other topological systems with antiferromagnetism. Comment: 6 figures

    Multiband effects in thermoelectric and electrical transport properties of kagome superconductors $A$V$_3$Sb$_5$ ($A$ = K, Rb, Cs)

    We studied the effects of multiband electronic structure on the thermoelectric and electrical transport properties in the normal state of kagome superconductors $A$V$_3$Sb$_5$ ($A$ = K, Rb, Cs). In all three members, the multiband nature is manifested by sign changes in the temperature dependence of the Seebeck and Hall resistivity, together with sublinear response of the isothermal Nernst and Hall effects to external magnetic fields in the charge ordered state. Moreover, ambipolar transport effects appear ubiquitously in all three systems, giving rise to sizable Nernst signal. Finally, possible origins of the sign reversal in the temperature dependence of the Hall effect are discussed. Comment: 8 pages, 5 figures. To appear in New Journal of Physics